106
9
Probability and Likelihood
The reader can readily verify that a plot of $b$ versus $k$ is a hump whose central
term occurs at $m = [(n + 1)p]$, where the notation $[x]$ signifies “the largest integer
not exceeding $x$”.
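As a quick numerical check of the central term, the following Python sketch compares $m = [(n + 1)p]$ with the position of the largest term found by brute force. The function names (`binom_pmf`, `central_term`, `argmax_pmf`) and the parameter values are illustrative assumptions, not part of the text. The values are chosen so that $(n + 1)p$ is not an integer, because otherwise the terms at $m - 1$ and $m$ are equal and the maximum is not unique.

```python
from math import comb, floor


def binom_pmf(k, n, p):
    """b(k; n, p): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def central_term(n, p):
    """m = [(n + 1) p], the largest integer not exceeding (n + 1) p."""
    return floor((n + 1) * p)


def argmax_pmf(n, p):
    """Index k of the largest term b(k; n, p), found by direct search."""
    return max(range(n + 1), key=lambda k: binom_pmf(k, n, p))


if __name__ == "__main__":
    # Illustrative parameter pairs; (n + 1) p is non-integer in each case.
    for n, p in [(20, 0.3), (10, 0.75), (50, 0.1)]:
        print(n, p, central_term(n, p), argmax_pmf(n, p))
```

For each pair the two printed indices should agree, which is what the hump-shaped plot implies.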
An important practical case arises where $n$ is large and $p$ is small, such that the
product $np = \lambda$ is of moderate size ($\sim 1$). The distribution can then be simplified:
$$b(k; n, p) = \binom{n}{k} \left(\frac{\lambda}{n}\right)^{k} \left(1 - \frac{\lambda}{n}\right)^{n-k} = \frac{\lambda^{k}}{k!} \left(1 - \frac{\lambda}{n}\right)^{n-k} \frac{n(n-1)\cdots(n-k+1)}{n^{k}} \,.$$
Now, $(1 - \lambda/n)^{n-k} \approx e^{-\lambda}$ and $n(n-1)\cdots(n-k+1)/n^{k} \approx 1$; hence,
$$b(k; n, p) \approx \frac{\lambda^{k}}{k!}\, e^{-\lambda} = p(k; \lambda) \,, \tag{9.28}$$
which is called the Poisson approximation to the binomial distribution. However, if
$\lambda$ is fixed, then $\sum_{k} p(k; \lambda) = 1$; hence, $p(k; \lambda)$, the probability of exactly $k$ successes
occurring, is a distribution in its own right, called the Poisson distribution. It is of
great importance in nature, describing processes lacking memory.
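The quality of the Poisson approximation can be explored numerically. The sketch below holds $\lambda = np$ fixed while $n$ grows and reports the largest absolute difference between $b(k; n, \lambda/n)$ and $p(k; \lambda)$ over the first few values of $k$. The helper names (`binom_pmf`, `poisson_pmf`, `max_abs_error`), the cut-off `k_max` and the chosen values of $n$ are assumptions made for illustration only.

```python
from math import comb, exp, factorial


def binom_pmf(k, n, p):
    """b(k; n, p): binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """p(k; lambda) = lambda^k e^{-lambda} / k!."""
    return lam**k / factorial(k) * exp(-lam)


def max_abs_error(n, lam, k_max=10):
    """Largest |b(k; n, lam/n) - p(k; lam)| for k = 0, ..., k_max."""
    p = lam / n
    return max(
        abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(k_max + 1)
    )


if __name__ == "__main__":
    lam = 1.0  # illustrative value of moderate size
    for n in (10, 100, 1000):
        print(n, max_abs_error(n, lam))
    # Partial sum of the Poisson probabilities; the neglected tail is tiny.
    print(sum(poisson_pmf(k, lam) for k in range(50)))
```

The error should shrink as $n$ increases with $\lambda$ held fixed, and the partial sum of the Poisson terms should be very close to 1.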
The probability $f(k; r, p)$ that exactly $k$ failures precede the $r$th success (i.e.,
exactly $k$ failures among $r + k - 1$ trials followed by success) is
$$f(k; r, p) = \binom{r + k - 1}{k} p^{r} q^{k} = \binom{-r}{k} p^{r} (-q)^{k} \,, \quad k = 0, 1, 2, \ldots \,. \tag{9.29}$$
Iff$^{9}$
$$\sum_{k=0}^{\infty} f(k; r, p) = 1 \,, \tag{9.30}$$
the possibility that an infinite sequence of trials produces fewer than $r$ successes can
be discounted, since by the binomial theorem
$$\sum_{k=0}^{\infty} \binom{-r}{k} (-q)^{k} = p^{-r} \,, \tag{9.31}$$
which equals 1 when multiplied by $p^{r}$. The sequence $f(k; r, p)$ is called the negative
binomial distribution.
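Both forms of (9.29) and the normalization (9.30) can be checked numerically. The following sketch evaluates the generalized binomial coefficient $\binom{-r}{k} = (-r)(-r-1)\cdots(-r-k+1)/k!$ exactly in integer arithmetic and sums the probabilities up to a finite cut-off. The function names (`gen_binom`, `neg_binom_pmf`) and the values of $r$, $p$ and the cut-off are illustrative assumptions, not taken from the text.

```python
from math import comb, factorial


def gen_binom(a, k):
    """Generalized binomial coefficient a(a-1)...(a-k+1)/k! for integer a."""
    num = 1
    for j in range(k):
        num *= a - j
    # A product of k consecutive integers is divisible by k!, so this is exact.
    return num // factorial(k)


def neg_binom_pmf(k, r, p):
    """f(k; r, p): probability that exactly k failures precede the r-th success."""
    q = 1 - p
    return comb(r + k - 1, k) * p**r * q**k


if __name__ == "__main__":
    r, p = 3, 0.4  # illustrative values
    # Identity behind the two forms in (9.29): C(-r, k) (-1)^k = C(r+k-1, k).
    for k in range(6):
        print(k, gen_binom(-r, k) * (-1) ** k, comb(r + k - 1, k))
    # Partial sum of (9.30); the omitted tail is negligible for this cut-off.
    print(sum(neg_binom_pmf(k, r, p) for k in range(200)))
```

The two printed coefficient columns should coincide, and the partial sum should be very close to 1.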
Example. Suppose that the normal rate of infection of a certain disease in cattle is
25%.$^{10}$ An experimental vaccine is injected into $n$ animals. If it is wholly ineffectual,
the probability that exactly $k$ animals remain free from infection is $b(k; n, 0.75)$; for
$k = n = 10$, this probability is approximately 0.056; the probability that at most 1 animal
out of 17 becomes infected is slightly lower, approximately 0.050, and for 2 out of
9 If and only if.
10 Due to P. V. Sukhatme and V. G. Panse, quoted by Feller (1967), Chap. 6.